Hierarchical Multi-Attention Transfer for Knowledge Distillation
Authors
Abstract
Knowledge distillation (KD) is a powerful and widely applicable technique for the compression of deep learning models. The main idea of KD is to transfer knowledge from a large teacher model to a small student model, where the attention mechanism has been intensively explored owing to its great flexibility in managing different teacher-student architectures. However, existing attention-based methods usually transfer the same kind of attention across the intermediate layers of neural networks, leaving the hierarchical structure of feature representation poorly investigated for distillation. In this paper, we propose a hierarchical multi-attention transfer framework (HMAT), in which different types of attention are utilized at different levels of representation. Specifically, position-based and channel-based attention transfer characterize low-level and high-level feature representations respectively, while activation-based attention transfer characterizes both low-level and mid-level representations. Extensive experiments on three popular visual recognition tasks, image classification, image retrieval, and object detection, demonstrate that the proposed HMAT significantly outperforms recent state-of-the-art KD methods.
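The abstract does not spell out HMAT's loss functions, but the activation-based attention transfer it builds on is a well-known KD objective (in the style of Zagoruyko & Komodakis): collapse each feature map to a spatial attention map and penalize the distance between the student's and teacher's normalized maps. A minimal PyTorch sketch, with hypothetical function names, assuming teacher and student features share the same spatial size (channel counts may differ, which is what makes attention transfer flexible across architectures):

```python
import torch
import torch.nn.functional as F

def attention_map(feat: torch.Tensor) -> torch.Tensor:
    """Collapse channels of a (N, C, H, W) feature map into a spatial
    attention map by summing squared activations, then flatten and
    L2-normalize per sample."""
    am = feat.pow(2).sum(dim=1).flatten(1)  # (N, H*W)
    return F.normalize(am, dim=1)

def attention_transfer_loss(f_student: torch.Tensor,
                            f_teacher: torch.Tensor) -> torch.Tensor:
    """Mean squared distance between normalized attention maps.
    Requires matching batch and spatial sizes; channel counts can differ."""
    return (attention_map(f_student) - attention_map(f_teacher)).pow(2).mean()
```

Because the channel dimension is summed out, the student and teacher layers being matched do not need the same width, only the same spatial resolution; position-based and channel-based variants would instead compare spatial-position or channel-wise attention statistics.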
Similar resources
Hierarchical Functional Concepts for Knowledge Transfer among Reinforcement Learning Agents
This article introduces the notions of functional space and concept as a way of knowledge representation and abstraction for Reinforcement Learning agents. These definitions are used as a tool of knowledge transfer among agents. The agents are assumed to be heterogeneous; they have different state spaces but share the same dynamics, reward, and action space. In other words, the agents are assumed t...
Multi-Resolution Learning for Knowledge Transfer
Related objects may look similar at low-resolutions; differences begin to emerge naturally as the resolution is increased. By learning across multiple resolutions of input, knowledge can be transfered between related objects. My dissertation develops this idea and applies it to the problem of multitask transfer learning.
Knowledge Transfer for Multi-labeler Active Learning
In this paper, we address multi-labeler active learning, where data labels can be acquired from multiple labelers with various levels of expertise. Because obtaining labels for data instances can be very costly and time-consuming, it is highly desirable to model each labeler’s expertise and only to query an instance’s label from the labeler with the best expertise. However, in an active learnin...
Hierarchical Multi-scale Attention Networks for action recognition
Recurrent Neural Networks (RNNs) have been widely used in natural language processing and computer vision. Among them, the Hierarchical Multi-scale RNN (HM-RNN), a kind of multi-scale hierarchical RNN proposed recently, can learn the hierarchical temporal structure from data automatically. In this paper, we extend the work to solve the computer vision task of action recognition. However, in seq...
Journal
Journal title: ACM Transactions on Multimedia Computing, Communications, and Applications
Year: 2022
ISSN: 1551-6857, 1551-6865
DOI: https://doi.org/10.1145/3568679